Overview

Dataset statistics

Number of variables: 9
Number of observations: 500
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 35.3 KiB
Average record size in memory: 72.3 B

Variable types

NUM: 8
BOOL: 1

Warnings

Serial No. is highly correlated with uniq_ID (High correlation)
uniq_ID is highly correlated with Serial No. (High correlation)
uniq_ID has unique values (Unique)
Serial No. has unique values (Unique)

Reproduction

Analysis started: 2021-08-08 16:46:27.076985
Analysis finished: 2021-08-08 16:46:39.802633
Duration: 12.73 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

uniq_ID
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 500
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 250.5
Minimum: 1
Maximum: 500
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.9 KiB

Quantile statistics

Minimum: 1
5th percentile: 25.95
Q1: 125.75
Median: 250.5
Q3: 375.25
95th percentile: 475.05
Maximum: 500
Range: 499
Interquartile range (IQR): 249.5
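
The non-integer quantiles above are consistent with linear interpolation between neighbouring sorted values. The following is a minimal sketch, assuming that uniq_ID holds exactly the integers 1 to 500 (suggested by the minimum, maximum and distinct count, but not stated in the report) and that the report interpolates linearly; the `percentile` helper is hypothetical and not part of pandas-profiling.

```python
def percentile(sorted_values, q):
    """Linear-interpolation percentile for q in [0, 1] over sorted data."""
    position = q * (len(sorted_values) - 1)   # fractional index into the data
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

# Assumption: uniq_ID is exactly 1, 2, ..., 500.
values = list(range(1, 501))

p5 = percentile(values, 0.05)
q1 = percentile(values, 0.25)
median = percentile(values, 0.50)
q3 = percentile(values, 0.75)
p95 = percentile(values, 0.95)
iqr = q3 - q1

print(p5, q1, median, q3, p95, iqr)
```

Under these assumptions the computed values agree with the reported quantiles up to floating-point rounding.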

Descriptive statistics

Standard deviation: 144.4818328
Coefficient of variation (CV): 0.5767737835
Kurtosis: -1.2
Mean: 250.5
Median Absolute Deviation (MAD): 125
Skewness: 0
Sum: 125250
Variance: 20875
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
500                 1      0.2%
171                 1      0.2%
158                 1      0.2%
159                 1      0.2%
160                 1      0.2%
161                 1      0.2%
162                 1      0.2%
163                 1      0.2%
164                 1      0.2%
165                 1      0.2%
Other values (490)  490    98.0%

Value               Count  Frequency (%)
1                   1      0.2%
2                   1      0.2%
3                   1      0.2%
4                   1      0.2%
5                   1      0.2%

Value               Count  Frequency (%)
500                 1      0.2%
499                 1      0.2%
498                 1      0.2%
497                 1      0.2%
496                 1      0.2%

Serial No.
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 500
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 250.5
Minimum: 1
Maximum: 500
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.9 KiB

Quantile statistics

Minimum: 1
5th percentile: 25.95
Q1: 125.75
Median: 250.5
Q3: 375.25
95th percentile: 475.05
Maximum: 500
Range: 499
Interquartile range (IQR): 249.5

Descriptive statistics

Standard deviation: 144.4818328
Coefficient of variation (CV): 0.5767737835
Kurtosis: -1.2
Mean: 250.5
Median Absolute Deviation (MAD): 125
Skewness: 0
Sum: 125250
Variance: 20875
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
500                 1      0.2%
171                 1      0.2%
158                 1      0.2%
159                 1      0.2%
160                 1      0.2%
161                 1      0.2%
162                 1      0.2%
163                 1      0.2%
164                 1      0.2%
165                 1      0.2%
Other values (490)  490    98.0%

Value               Count  Frequency (%)
1                   1      0.2%
2                   1      0.2%
3                   1      0.2%
4                   1      0.2%
5                   1      0.2%

Value               Count  Frequency (%)
500                 1      0.2%
499                 1      0.2%
498                 1      0.2%
497                 1      0.2%
496                 1      0.2%

GRE Score
Real number (ℝ≥0)

Distinct: 49
Distinct (%): 9.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 316.472
Minimum: 290
Maximum: 340
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.9 KiB

Quantile statistics

Minimum: 290
5th percentile: 298
Q1: 308
Median: 317
Q3: 325
95th percentile: 335
Maximum: 340
Range: 50
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 11.29514837
Coefficient of variation (CV): 0.03569083007
Kurtosis: -0.7110644626
Mean: 316.472
Median Absolute Deviation (MAD): 8
Skewness: -0.03984185809
Sum: 158236
Variance: 127.5803768
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=49)
Value               Count  Frequency (%)
312                 24     4.8%
324                 23     4.6%
316                 18     3.6%
321                 17     3.4%
322                 17     3.4%
327                 17     3.4%
314                 16     3.2%
311                 16     3.2%
320                 16     3.2%
317                 15     3.0%
Other values (39)   321    64.2%

Value               Count  Frequency (%)
290                 2      0.4%
293                 1      0.2%
294                 2      0.4%
295                 5      1.0%
296                 5      1.0%

Value               Count  Frequency (%)
340                 9      1.8%
339                 3      0.6%
338                 4      0.8%
337                 2      0.4%
336                 5      1.0%

TOEFL Score
Real number (ℝ≥0)

Distinct: 29
Distinct (%): 5.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 107.192
Minimum: 92
Maximum: 120
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.9 KiB

Quantile statistics

Minimum: 92
5th percentile: 98
Q1: 103
Median: 107
Q3: 112
95th percentile: 118
Maximum: 120
Range: 28
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 6.08186766
Coefficient of variation (CV): 0.05673807429
Kurtosis: -0.6532454042
Mean: 107.192
Median Absolute Deviation (MAD): 5
Skewness: 0.09560097236
Sum: 53596
Variance: 36.98911423
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=29)
Value               Count  Frequency (%)
110                 44     8.8%
105                 37     7.4%
104                 29     5.8%
112                 28     5.6%
107                 28     5.6%
106                 28     5.6%
103                 25     5.0%
102                 24     4.8%
100                 24     4.8%
99                  23     4.6%
Other values (19)   210    42.0%

Value               Count  Frequency (%)
92                  1      0.2%
93                  2      0.4%
94                  2      0.4%
95                  3      0.6%
96                  6      1.2%

Value               Count  Frequency (%)
120                 9      1.8%
119                 10     2.0%
118                 10     2.0%
117                 8      1.6%
116                 16     3.2%

University Rating
Real number (ℝ≥0)

Distinct: 5
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.114
Minimum: 1
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.9 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 2
Median: 3
Q3: 4
95th percentile: 5
Maximum: 5
Range: 4
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.143511801
Coefficient of variation (CV): 0.3672163779
Kurtosis: -0.8100796635
Mean: 3.114
Median Absolute Deviation (MAD): 1
Skewness: 0.09029498313
Sum: 1557
Variance: 1.307619238
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=5)
Value               Count  Frequency (%)
3                   162    32.4%
2                   126    25.2%
4                   105    21.0%
5                   73     14.6%
1                   34     6.8%

Value               Count  Frequency (%)
1                   34     6.8%
2                   126    25.2%
3                   162    32.4%
4                   105    21.0%
5                   73     14.6%

Value               Count  Frequency (%)
5                   73     14.6%
4                   105    21.0%
3                   162    32.4%
2                   126    25.2%
1                   34     6.8%

SOP
Real number (ℝ≥0)

Distinct: 9
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.374
Minimum: 1
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.9 KiB

Quantile statistics

Minimum: 1
5th percentile: 1.5
Q1: 2.5
Median: 3.5
Q3: 4
95th percentile: 5
Maximum: 5
Range: 4
Interquartile range (IQR): 1.5

Descriptive statistics

Standard deviation: 0.9910036208
Coefficient of variation (CV): 0.2937177299
Kurtosis: -0.7057169536
Mean: 3.374
Median Absolute Deviation (MAD): 0.5
Skewness: -0.2289723963
Sum: 1687
Variance: 0.9820881764
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Value               Count  Frequency (%)
4                   89     17.8%
3.5                 88     17.6%
3                   80     16.0%
2.5                 64     12.8%
4.5                 63     12.6%
2                   43     8.6%
5                   42     8.4%
1.5                 25     5.0%
1                   6      1.2%

Value               Count  Frequency (%)
1                   6      1.2%
1.5                 25     5.0%
2                   43     8.6%
2.5                 64     12.8%
3                   80     16.0%

Value               Count  Frequency (%)
5                   42     8.4%
4.5                 63     12.6%
4                   89     17.8%
3.5                 88     17.6%
3                   80     16.0%

LOR
Real number (ℝ≥0)

Distinct: 9
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.484
Minimum: 1
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.9 KiB

Quantile statistics

Minimum: 1
5th percentile: 2
Q1: 3
Median: 3.5
Q3: 4
95th percentile: 5
Maximum: 5
Range: 4
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.9254495739
Coefficient of variation (CV): 0.2656284655
Kurtosis: -0.7457485106
Mean: 3.484
Median Absolute Deviation (MAD): 0.5
Skewness: -0.1452903146
Sum: 1742
Variance: 0.8564569138
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Value               Count  Frequency (%)
3                   99     19.8%
4                   94     18.8%
3.5                 86     17.2%
4.5                 63     12.6%
5                   50     10.0%
2.5                 50     10.0%
2                   46     9.2%
1.5                 11     2.2%
1                   1      0.2%

Value               Count  Frequency (%)
1                   1      0.2%
1.5                 11     2.2%
2                   46     9.2%
2.5                 50     10.0%
3                   99     19.8%

Value               Count  Frequency (%)
5                   50     10.0%
4.5                 63     12.6%
4                   94     18.8%
3.5                 86     17.2%
3                   99     19.8%

CGPA
Real number (ℝ≥0)

Distinct: 184
Distinct (%): 36.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.57644
Minimum: 6.8
Maximum: 9.92
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.9 KiB

Quantile statistics

Minimum: 6.8
5th percentile: 7.638
Q1: 8.1275
Median: 8.56
Q3: 9.04
95th percentile: 9.6
Maximum: 9.92
Range: 3.12
Interquartile range (IQR): 0.9125

Descriptive statistics

Standard deviation: 0.6048128003
Coefficient of variation (CV): 0.07052026253
Kurtosis: -0.5612783981
Mean: 8.57644
Median Absolute Deviation (MAD): 0.46
Skewness: -0.02661251732
Sum: 4288.22
Variance: 0.3657985234
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
8                   9      1.8%
8.76                9      1.8%
8.54                7      1.4%
8.45                7      1.4%
8.56                7      1.4%
8.12                7      1.4%
7.88                6      1.2%
8.64                6      1.2%
8.66                6      1.2%
9.11                6      1.2%
Other values (174)  430    86.0%

Value               Count  Frequency (%)
6.8                 1      0.2%
7.2                 1      0.2%
7.21                1      0.2%
7.23                1      0.2%
7.25                1      0.2%

Value               Count  Frequency (%)
9.92                1      0.2%
9.91                1      0.2%
9.87                2      0.4%
9.86                1      0.2%
9.82                1      0.2%

Research
Boolean

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 KiB
Value               Count  Frequency (%)
1                   280    56.0%
0                   220    44.0%

Interactions

[Figures: 64 interaction plots]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
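
The calculation described above can be sketched in plain Python. This is a minimal illustration rather than the code the report itself runs; the `pearson_r` helper is hypothetical, and the inputs are the GRE Score and CGPA values from the ten "First rows" of the sample.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample covariance and sample standard deviations (n - 1 in the denominator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (std_x * std_y)

# GRE Score and CGPA from the ten "First rows" of the sample.
gre = [337, 324, 316, 322, 314, 330, 321, 308, 302, 323]
cgpa = [9.65, 8.87, 8.00, 8.67, 8.21, 9.34, 8.20, 7.90, 8.00, 8.60]

r = pearson_r(gre, cgpa)
# Shifting and rescaling (by positive factors) either variable leaves r unchanged.
r_rescaled = pearson_r([2 * g + 10 for g in gre], [c / 10 for c in cgpa])
print(round(r, 4), round(r_rescaled, 4))
```

The two printed values coincide, which illustrates the location and scale invariance. Ten rows are far too few to stand in for the full 500-observation correlation.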

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
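
As a sketch of that procedure, the snippet below ranks each variable and then applies the Pearson formula to the ranks. The helper names are hypothetical, ties are assumed to share the average of their positions, and the data is a toy monotonic but nonlinear relationship rather than anything from this dataset.

```python
from math import sqrt

def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance over the product of standard deviations (the n - 1 factors cancel)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Toy data: y = x**3 is strictly increasing in x but not linear.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

Here ρ reaches 1 (up to rounding) because the ordering is perfectly preserved, while r stays below 1 because the relationship is not a straight line.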

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
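
The pair counting can be sketched as follows. This implements the formula exactly as stated (often called tau-a); the `kendall_tau` helper and the toy data are illustrative assumptions, and statistical libraries commonly apply a tie correction (for example tau-b), so their results may differ when the data contains ties.

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """(concordant - discordant) / total pairs, with no tie correction."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # Positive product: both variables order the pair the same way.
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # direction == 0 means a tie in xs or ys; it counts as neither.
    return (concordant - discordant) / len(pairs)

# Toy data: 10 pairs in total, 7 concordant and 3 discordant.
x = [1, 2, 3, 4, 5]
y = [3, 1, 2, 5, 4]
tau = kendall_tau(x, y)
print(tau)  # (7 - 3) / 10 = 0.4
```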

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

[Figures: 2 missing-value plots]

Sample

First rows

     uniq_ID  Serial No.  GRE Score  TOEFL Score  University Rating  SOP  LOR  CGPA  Research
0    1        1           337        118          4                  4.5  4.5  9.65  1
1    2        2           324        107          4                  4.0  4.5  8.87  1
2    3        3           316        104          3                  3.0  3.5  8.00  1
3    4        4           322        110          3                  3.5  2.5  8.67  1
4    5        5           314        103          2                  2.0  3.0  8.21  0
5    6        6           330        115          5                  4.5  3.0  9.34  1
6    7        7           321        109          3                  3.0  4.0  8.20  1
7    8        8           308        101          2                  3.0  4.0  7.90  0
8    9        9           302        102          1                  2.0  1.5  8.00  0
9    10       10          323        108          3                  3.5  3.0  8.60  0

Last rows

     uniq_ID  Serial No.  GRE Score  TOEFL Score  University Rating  SOP  LOR  CGPA  Research
490  491      491         307        105          2                  2.5  4.5  8.12  1
491  492      492         297        99           4                  3.0  3.5  7.81  0
492  493      493         298        101          4                  2.5  4.5  7.69  1
493  494      494         300        95           2                  3.0  1.5  8.22  1
494  495      495         301        99           3                  2.5  2.0  8.45  1
495  496      496         332        108          5                  4.5  4.0  9.02  1
496  497      497         337        117          5                  5.0  5.0  9.87  1
497  498      498         330        120          5                  4.5  5.0  9.56  1
498  499      499         312        103          4                  4.0  5.0  8.43  0
499  500      500         327        113          4                  4.5  4.5  9.04  0